13:03
2026-07-01
github.com
large-language-models
Pollux – a natively vector quantized LLM with 0.76 bits per parameter
Researchers introduced Pollux, a new class of decoder-only LLMs that use native Leech-lattice quantization to achieve 0.76 bits per parameter, compressing a 1B-class model into 76 MB of SRAM. The mode…